Analysing the Limit Order Book

In this notebook, we investigate some of the empirical properties of limit order books (LOBs). This will motivate the models that we choose to use for representing the trading environment that different market participants face.

Level 1 data

This is the data from the touch only, i.e. the best bid price $P^b_t$, the best ask price $P^a_t$ and their associated volumes $V^b_t$ and $V^a_t$. It is very noisy and so is not really suitable as the sole basis for a sophisticated trading strategy.

Level 2 data

This gives the order book in its entirety, i.e. the prices and volumes at all levels. In practice you won't use all levels to determine a strategy, as the importance of prices and volumes drops off the further you are from the touch.

Level 3 data

This consists of order book data together with details of the order book's dynamics. We will see an example of this data (provided by NASDAQ) below. In general it is very expensive to obtain; however, academics can access data of this quality through LOBSTER. This is still expensive, but nowhere near as expensive as it is for institutions. Note that data for the last couple of days is omitted, so that one cannot actively trade off it.
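As a rough sketch of what working with this kind of message data can look like, the snippet below parses a few rows in a LOBSTER-style message layout. The layout is an assumption here: six comma-separated columns (time in seconds after midnight, event type code, order ID, size, price, direction), with prices stored as integers in units of $10^{-4}$ dollars and direction $+1$ for buy and $-1$ for sell orders. The rows themselves are made up for illustration; check the column layout against the documentation for your own files.

```python
import io
import numpy as np

# Made-up rows in the ASSUMED message layout:
# time (s after midnight), event type, order id, size, price (USD * 10^4), direction
sample_messages = io.StringIO(
    "34200.10,1,1001,100,1000100,1\n"
    "34200.25,1,1002,50,1000300,-1\n"
    "34200.90,4,1001,40,1000100,1\n"
)

# Load everything as floats; one row per order book event.
raw = np.loadtxt(sample_messages, delimiter=",")
time_s, event_type, order_id, size, price, direction = raw.T

# Convert the integer price representation to dollars (assumed scaling of 10^4).
price_usd = price / 10_000

# Boolean mask for events that refer to buy-side orders (assumed encoding +1).
is_buy = direction == 1
```

With a real file, the `io.StringIO` object would be replaced by the path to the message file.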

LOBSTER is a limit order book data tool that provides easy-to-use, high-quality limit order book data.

Since 2013, LOBSTER has acted as a data provider for the academic community, giving access to reconstructed limit order book data for the entire universe of NASDAQ-traded stocks.

More recently, it has started to make the data available on a commercial basis.

Visualising the Limit Order Book

Visualising the dynamics of the LOB

Prices

What are the dynamics of the mid- and microprices like?
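To make the two price series concrete, here is a minimal sketch of how they can be computed from Level 1 data. The mid-price is the average of the best bid and ask. For the microprice we assume one common definition, the volume-weighted touch price $(V^b_t P^a_t + V^a_t P^b_t) / (V^b_t + V^a_t)$, which may differ from the definition used elsewhere. The function names and the snapshot values are illustrative only.

```python
import numpy as np

def mid_price(bid, ask):
    """Mid-price: simple average of the best bid and best ask."""
    bid = np.asarray(bid, dtype=float)
    ask = np.asarray(ask, dtype=float)
    return 0.5 * (bid + ask)

def micro_price(bid, ask, bid_vol, ask_vol):
    """Volume-weighted touch price (ASSUMED microprice definition).

    Each price is weighted by the volume on the opposite side, so the
    result leans towards the ask when the bid queue is larger.
    """
    bid = np.asarray(bid, dtype=float)
    ask = np.asarray(ask, dtype=float)
    bid_vol = np.asarray(bid_vol, dtype=float)
    ask_vol = np.asarray(ask_vol, dtype=float)
    return (bid_vol * ask + ask_vol * bid) / (bid_vol + ask_vol)

# Illustrative (made-up) snapshots of the touch.
bid = np.array([100.00, 100.00, 100.01])
ask = np.array([100.02, 100.02, 100.02])
bid_vol = np.array([300, 100, 200])
ask_vol = np.array([100, 300, 200])

mid = mid_price(bid, ask)
micro = micro_price(bid, ask, bid_vol, ask_vol)
```

In the first snapshot the bid queue is three times the ask queue, so the microprice sits above the mid-price; when the two volumes are equal, as in the third snapshot, the two prices coincide.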

Investigating the distribution of returns over 1 second

Note that as the time horizon increases, the distribution of returns starts to look more Gaussian. There is only one day of data here but, if we had more, we would see this pattern continue over horizons of hours, days, weeks, etc.
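One way to quantify the observation above is to compute the excess kurtosis of returns at several horizons, since it is approximately zero for Gaussian data. The sketch below assumes a regularly sampled (1-second) price series and uses a synthetic series with heavy-tailed Student-t increments as a stand-in for real data; the helper names and horizons are illustrative choices.

```python
import numpy as np

def log_returns(prices, step):
    """Non-overlapping log returns over `step` samples of a regularly sampled series."""
    sampled = np.asarray(prices, dtype=float)[::step]
    return np.diff(np.log(sampled))

def excess_kurtosis(x):
    """Sample excess kurtosis; approximately 0 for Gaussian data."""
    x = np.asarray(x, dtype=float)
    centred = x - x.mean()
    m2 = np.mean(centred**2)
    m4 = np.mean(centred**4)
    return m4 / m2**2 - 3.0

# SYNTHETIC stand-in for a 1-second mid-price series (not real market data):
# heavy-tailed Student-t increments accumulated into a price path.
rng = np.random.default_rng(0)
one_second_returns = 1e-4 * rng.standard_t(df=5, size=60_000)
prices = 100.0 * np.exp(np.cumsum(one_second_returns))

# Excess kurtosis of returns at horizons of 1, 10 and 60 seconds.
kurtosis_by_horizon = {h: excess_kurtosis(log_returns(prices, h)) for h in (1, 10, 60)}
```

For independent heavy-tailed increments like these, the excess kurtosis should shrink towards zero as the horizon grows, which is the aggregational Gaussianity described above.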

Autocorrelation of returns

Note that the return over the previous second has no clear influence on what will happen over the next second. However, the absolute returns (a measure of volatility) are persistent.
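The comparison between returns and absolute returns can be sketched with a simple sample autocorrelation function. The example below is illustrative only: it uses synthetic returns whose volatility alternates between a low and a high regime, which is an assumed toy mechanism for volatility clustering rather than a model of the data in this notebook.

```python
import numpy as np

def autocorr(x, max_lag):
    """Sample autocorrelation of x at lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    centred = x - x.mean()
    denom = np.sum(centred**2)
    return np.array(
        [np.sum(centred[: len(x) - lag] * centred[lag:]) / denom for lag in range(max_lag + 1)]
    )

# SYNTHETIC returns: Gaussian noise scaled by a volatility that switches
# between 1 and 3 every 500 observations (toy volatility clustering).
rng = np.random.default_rng(1)
n = 20_000
volatility = np.where((np.arange(n) // 500) % 2 == 0, 1.0, 3.0)
returns = volatility * rng.standard_normal(n)

acf_returns = autocorr(returns, 20)
acf_abs_returns = autocorr(np.abs(returns), 20)
```

Here `acf_returns` should stay close to zero beyond lag 0, while `acf_abs_returns` should remain clearly positive across lags, mirroring the persistence of volatility.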

Interarrival times

Note that even with a log y-axis, the histogram does not look linear, as it would if the interarrival times were exponentially distributed. We check this with a Q-Q plot against the exponential distribution.
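A Q-Q comparison of this kind can be sketched without any plotting library by computing the two sets of quantiles directly. The code below assumes the event timestamps are available as an array of seconds; the exponential reference distribution is fitted by matching the sample mean, and the plotting positions $(i - 0.5)/n$ are one common convention among several. The timestamps shown are made up.

```python
import numpy as np

def interarrival_times(event_times):
    """Differences between consecutive (sorted) event timestamps."""
    return np.diff(np.sort(np.asarray(event_times, dtype=float)))

def exponential_qq(samples):
    """Return (theoretical, empirical) quantiles against an exponential
    distribution with the same mean as the sample."""
    empirical = np.sort(np.asarray(samples, dtype=float))
    n = empirical.size
    probs = (np.arange(1, n + 1) - 0.5) / n
    # Exponential quantile function: -mean * log(1 - p).
    theoretical = -empirical.mean() * np.log1p(-probs)
    return theoretical, empirical

# Illustrative (made-up) event timestamps in seconds.
event_times = np.array([0.0, 0.4, 0.5, 2.1, 2.2, 2.25, 5.0])
theoretical, empirical = exponential_qq(interarrival_times(event_times))
```

Plotting `empirical` against `theoretical` together with the 45-degree line gives the Q-Q plot; systematic departures from the line, especially in the upper tail, indicate that the exponential assumption is a poor fit.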

Idea: fit a power law
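One possible way to follow up on this idea is a maximum-likelihood fit of the tail exponent. The sketch below assumes a continuous power-law density $p(x) \propto x^{-\alpha}$ for $x \ge x_{\min}$, for which the estimator is $\hat{\alpha} = 1 + n / \sum_i \ln(x_i / x_{\min})$. The threshold `x_min` is a modelling choice that has to be made separately, and the function name is hypothetical. The check uses synthetic Pareto samples, not the interarrival times from the data.

```python
import numpy as np

def power_law_mle(samples, x_min):
    """MLE of alpha for p(x) ~ x^(-alpha) on x >= x_min (continuous case).

    Returns the estimate and the number of tail observations used.
    """
    x = np.asarray(samples, dtype=float)
    tail = x[x >= x_min]
    n = tail.size
    alpha_hat = 1.0 + n / np.sum(np.log(tail / x_min))
    return alpha_hat, n

# SYNTHETIC check: draw from a power law with known exponent by inverse
# transform sampling, then see whether the estimator recovers it.
rng = np.random.default_rng(2)
true_alpha, x_min = 2.5, 0.01
u = rng.random(100_000)
synthetic = x_min * (1.0 - u) ** (-1.0 / (true_alpha - 1.0))

alpha_hat, n_tail = power_law_mle(synthetic, x_min)
```

On real interarrival times, the estimate is only meaningful above a threshold where the tail actually looks like a power law, so it is worth repeating the fit for several values of `x_min`.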

Fill probabilities

Relationship between distance to touch and fill probability

Price impact

To be added